253 research outputs found

    Tracking repeats using significance and transitivity.

    Keywords: transitivity; extreme value distribution. Motivation: Internal repeats in coding sequences correspond to structural and functional units of proteins. Moreover, duplication of fragments of coding sequences is known to be a mechanism to facilitate evolution. Identification of repeats is crucial to shed light on the function and structure of proteins, and explain their evolutionary past. The task is difficult because during the course of evolution many repeats diverged beyond recognition. Results: We introduce a new method, TRUST, for ab-initio determination of internal repeats in proteins. It provides an improvement in prediction quality as compared to alternative state-of-the-art methods. The increased sensitivity and accuracy of the method is achieved by exploiting the concept of transitivity of alignments. Starting from significant local suboptimal alignments, the application of transitivity allows us to: 1) identify distant repeat homologues for which no alignments were found; 2) gain confidence about consistently well-aligned regions; and 3) recognize and reduce the contribution of nonhomologous repeats. This reassessment step enables us to derive a virtually noise-free profile representing a generalized repeat with high fidelity. We also obtained superior specificity by employing rigid statistical testing for self-sequence and profile-sequence alignments. Assessment was done using a database of repeat annotations based on structural superpositioning. The results show that TRUST is a useful and reliable tool for mining tandem and non-tandem repeats in protein sequence databases, able to predict multiple repeat types with varying intervening segments within a single sequence.
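
    The transitivity idea in the abstract above (if position a aligns with b, and b aligns with c, then a and c are candidate homologues even without a direct alignment) can be illustrated with a small union-find sketch. This is not the TRUST implementation: the function name `transitive_groups`, the input format (pairs of residue positions taken from significant alignments), and the example coordinates are assumptions made for illustration, and the statistical reassessment described in the abstract is omitted.

```python
def transitive_groups(aligned_pairs):
    """Group residue positions that are linked, directly or indirectly,
    by pairwise alignments (hypothetical input: (i, j) position pairs)."""
    parent = {}

    def find(x):
        # Return the representative of x, registering x on first sight.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in aligned_pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i  # merge the two groups

    groups = {}
    for pos in list(parent):
        groups.setdefault(find(pos), []).append(pos)
    return sorted(sorted(g) for g in groups.values())


# Positions 3 and 77 are never aligned directly, but both align with 40.
print(transitive_groups([(3, 40), (40, 77), (4, 41)]))
```

    Each returned group plays the role of one column of a repeat profile in this toy picture; positions 3 and 77 end up together only because of the shared partner 40.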

    Three-dimensional Ising model in the fixed-magnetization ensemble: a Monte Carlo study

    We study the three-dimensional Ising model at the critical point in the fixed-magnetization ensemble, by means of the recently developed geometric cluster Monte Carlo algorithm. We define a magnetic-field-like quantity in terms of microscopic spin-up and spin-down probabilities in a given configuration of neighbors. In the thermodynamic limit, the relation between this field and the magnetization reduces to the canonical relation M(h). However, for finite systems, the relation is different. We establish a close connection between this relation and the probability distribution of the magnetization of a finite-size system in the canonical ensemble. Comment: 8 pages, 2 Postscript figures, uses RevTeX

    A Monte Carlo study of the triangular lattice gas with the first- and the second-neighbor exclusions

    We formulate a Swendsen-Wang-like version of the geometric cluster algorithm. As an application, we study the hard-core lattice gas on the triangular lattice with the first- and the second-neighbor exclusions. The data are analyzed by finite-size scaling, but the possible existence of logarithmic corrections is not considered due to the limited data. We determine the critical chemical potential as $\mu_c = 1.75682(2)$ and the critical particle density as $\rho_c = 0.180(4)$. The thermal and magnetic exponents $y_t = 1.51(1) \approx 3/2$ and $y_h = 1.8748(8) \approx 15/8$, estimated from Binder ratio $Q$ and susceptibility $\chi$, strongly support the general belief that the model is in the 4-state Potts universality class. On the other hand, the analyses of energy-like quantities yield the thermal exponent $y_t$ ranging from $1.440(5)$ to $1.470(5)$. These values differ significantly from the expected value 3/2, and thus imply the existence of logarithmic corrections. Comment: 4 figures, 2 tables
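
    The abstract above estimates exponents from a Binder ratio without defining it. As a minimal sketch, assuming the common convention Q = <m^2>^2 / <m^4> for an order-parameter sample m (the abstract does not say which convention or which order parameter the authors use), the ratio could be computed from Monte Carlo samples as follows. The function name `binder_ratio` and the sample values are illustrative only.

```python
def binder_ratio(samples):
    """Return <m^2>^2 / <m^4> for a list of order-parameter samples.

    Assumed convention; other papers use, for example, 1 - <m^4> / (3 <m^2>^2).
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    m2 = sum(m * m for m in samples) / n
    m4 = sum(m ** 4 for m in samples) / n
    return m2 * m2 / m4


# Samples concentrated at +/-1 give Q = 1 under this convention.
print(binder_ratio([1.0, -1.0, 1.0, -1.0]))
```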

    RNA structure prediction from evolutionary patterns of nucleotide composition

    Structural elements in RNA molecules have a distinct nucleotide composition, which changes gradually over evolutionary time. We discovered certain features of these compositional patterns that are shared between all RNA families. Based on this information, we developed a structure prediction method that evaluates candidate structures for a set of homologous RNAs on their ability to reproduce the patterns exhibited by biological structures. The method is named SPuNC for ‘Structure Prediction using Nucleotide Composition’. In a performance test on a diverse set of RNA families we demonstrate that the SPuNC algorithm succeeds in selecting the most realistic structures in an ensemble. The average accuracy of top-scoring structures is significantly higher than the average accuracy of all ensemble members (improvements of more than 20% observed). In addition, a consensus structure that includes the most reliable base pairs gleaned from a set of top-scoring structures is generally more accurate than a consensus derived from the full structural ensemble. Our method achieves better accuracy than existing methods on several RNA families, including novel riboswitches and ribozymes. The results clearly show that nucleotide composition can be used to reveal the quality of RNA structures and thus the presented technique should be added to the set of prediction tools.
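
    The consensus step mentioned in the abstract above can be pictured with a short sketch. It assumes that each candidate structure is given as a set of (i, j) base-pair index tuples and that "most reliable" means "present in at least a chosen fraction of the top-scoring structures". Both are assumptions made here, not details taken from SPuNC, and the name `consensus_pairs` is hypothetical.

```python
from collections import Counter


def consensus_pairs(structures, min_fraction=0.5):
    """Return base pairs found in at least min_fraction of the structures.

    structures: iterable of sets of (i, j) base-pair tuples (assumed format).
    """
    structures = list(structures)
    counts = Counter(pair for s in structures for pair in set(s))
    threshold = min_fraction * len(structures)
    return sorted(pair for pair, c in counts.items() if c >= threshold)


top_scoring = [
    {(1, 20), (2, 19)},
    {(1, 20), (3, 18)},
    {(1, 20), (2, 19)},
]
# (3, 18) occurs in only one of three structures and is dropped.
print(consensus_pairs(top_scoring, min_fraction=0.6))
```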

    Graphical representations and cluster algorithms for critical points with fields

    A two-replica graphical representation and an associated cluster algorithm are described that are applicable to ferromagnetic Ising systems with arbitrary fields. Critical points are associated with the percolation threshold of the graphical representation. Results from numerical simulations of the Ising model in a staggered field are presented. The dynamic exponent for the algorithm is measured to be less than 0.5. Comment: Revtex, 12 pages with 2 figures

    Generalized Geometric Cluster Algorithm for Fluid Simulation

    We present a detailed description of the generalized geometric cluster algorithm for the efficient simulation of continuum fluids. The connection with well-known cluster algorithms for lattice spin models is discussed, and an explicit full cluster decomposition is derived for a particle configuration in a fluid. We investigate a number of basic properties of the geometric cluster algorithm, including the dependence of the cluster-size distribution on density and temperature. Practical aspects of its implementation and possible extensions are discussed. The capabilities and efficiency of our approach are illustrated by means of two example studies. Comment: Accepted for publication in Phys. Rev. E. Follow-up to cond-mat/041274

    Numerical Solution of Hard-Core Mixtures

    We study the equilibrium phase diagram of binary mixtures of hard spheres as well as of parallel hard cubes. A superior cluster algorithm allows us to establish and to access the demixed phase for both systems and to investigate the subtle interplay between short-range depletion and long-range demixing. Comment: 4 pages, 2 figures

    Aubergene - a sensitive genome alignment tool.

    Motivation: The accumulation of genome sequences will only accelerate in the coming years. We aim to use this abundance of data to improve the quality of genomic alignments and devise a method which is capable of detecting regions evolving under weak or no evolutionary constraints. Results: We describe a genome alignment program, AuberGene, which explores the idea of transitivity of local alignments. Assessment of the program was done based on a 2 Mbp genomic region containing the CFTR gene of 13 species. In this region, we can identify 53% of human sequence sharing common ancestry with mouse, as compared with 44% found using the usual pairwise alignment. Between human and tetraodon 93 orthologous exons are found, as compared with 77 detected by the pairwise human-tetraodon comparison. AuberGene allows the user to (1) identify distant, previously undetected, conserved orthologous regions such as ORFs or regulatory regions; (2) identify neutrally evolving regions in related species which are often overlooked by other alignment programs; (3) recognize false orthologous genomic regions. The increased sensitivity of the method is not obtained at the cost of reduced specificity. Our results suggest that, over the CFTR region, human shares 10% more sequence with mouse than previously thought (~50%, instead of 40% found with the pairwise alignment). © 2006 Oxford University Press
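
    Transitivity of local alignments across species can be sketched as the composition of two position maps: if a human position aligns to a mouse position, and that mouse position aligns to a tetraodon position, a human-tetraodon correspondence can be inferred even when the direct pairwise comparison misses it. The following is a toy illustration under that assumption only. The dictionary representation, the function names, and the coordinates are hypothetical, and AuberGene's actual scoring and filtering of false orthologues are not modelled.

```python
def compose_alignments(a_to_b, b_to_c):
    """Infer A->C position pairs by going through the intermediate genome B."""
    return {a: b_to_c[b] for a, b in a_to_b.items() if b in b_to_c}


def add_transitive(direct_a_to_c, a_to_b, b_to_c):
    """Extend a direct A->C alignment with pairs inferred through B.

    Directly aligned positions are kept unchanged; inferred pairs only
    fill positions that the direct alignment left uncovered.
    """
    merged = dict(direct_a_to_c)
    for a, c in compose_alignments(a_to_b, b_to_c).items():
        merged.setdefault(a, c)
    return merged


human_mouse = {100: 210, 101: 211, 102: 212}
mouse_tetraodon = {211: 55, 212: 56}
human_tetraodon = {102: 56}  # what the direct pairwise comparison found

print(add_transitive(human_tetraodon, human_mouse, mouse_tetraodon))
```

    In this example the pair 101 -> 55 is recovered only through the mouse intermediate, which mirrors the gain in detected orthologous sequence reported in the abstract.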

    Monte Carlo Renormalization of the 3-D Ising model: Analyticity and Convergence

    We review the assumptions on which the Monte Carlo renormalization technique is based, in particular the analyticity of the block spin transformations. On this basis, we select an optimized Kadanoff blocking rule in combination with the simulation of a d=3 Ising model with reduced corrections to scaling. This is achieved by including interactions with second and third neighbors. As a consequence of the improved analyticity properties, this Monte Carlo renormalization method yields a fast convergence and a high accuracy. The results for the critical exponents are y_H=2.481(1) and y_T=1.585(3). Comment: RevTeX, 4 PostScript files
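
    A block spin transformation of the kind referred to in the abstract above can be sketched with the plain majority rule on 2 x 2 x 2 blocks. This is the textbook rule, swapped in for illustration; the optimized Kadanoff blocking rule selected by the authors is not specified in the abstract and is not reproduced here. The function name `block_spins`, the nested-list lattice layout, and the random tie-breaking are assumptions.

```python
import random


def block_spins(spins, b=2, rng=random):
    """Coarse-grain an L x L x L lattice of +1/-1 spins with b x b x b blocks.

    Each block spin takes the sign of the sum of its spins; a zero sum is
    resolved by a random choice (plain majority rule, assumed here).
    """
    size = len(spins)
    if size % b != 0:
        raise ValueError("lattice size must be a multiple of the block size")
    n = size // b
    blocked = [[[0] * n for _ in range(n)] for _ in range(n)]
    for bx in range(n):
        for by in range(n):
            for bz in range(n):
                total = sum(
                    spins[b * bx + dx][b * by + dy][b * bz + dz]
                    for dx in range(b)
                    for dy in range(b)
                    for dz in range(b)
                )
                if total > 0:
                    blocked[bx][by][bz] = 1
                elif total < 0:
                    blocked[bx][by][bz] = -1
                else:
                    blocked[bx][by][bz] = rng.choice((-1, 1))
    return blocked


# A fully ordered 4 x 4 x 4 lattice maps to a fully ordered 2 x 2 x 2 lattice.
ordered = [[[1] * 4 for _ in range(4)] for _ in range(4)]
print(block_spins(ordered))
```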